CONNECTIVITY and CREATIVITY in times of CONFLICT


Usability and UX evaluation of an 
online interactive virtual learning environment: 
a case study of Wales’ virtual hospital 


Fatma Layas, Yolanda Rendon-Guerrero, Tim Stokes, Sean Jenkins


Assistive Technologies Innovation Centre (ATIC), University of Wales Trinity Saint David, UK 
f.layas@uwtsd.ac.uk 
y.rendon-guerrero@uwtsd.ac.uk 
tim.stokes@uwtsd.ac.uk 
sean.jenkins@uwtsd.ac.uk 


Abstract 

Clinical placements are an essential component of the ed- 
ucation provision for students of medicine and other health 
professions. However, opportunities to achieve learning out- 
comes cannot be consistent across students due to the very 
nature of their exposure to different patients in different 
timeframes and settings. In addition, the unpredictability of
patient attendance and the impact of the COVID-19 pandemic
have resulted in few opportunities to experience more than
one point in a patient journey. An innovative online virtual
environment named Wales’ Virtual Hospital (WVH) was
developed using an agile software development and
User-Centred Design approach. This research paper presents
the comprehensive usability and user experience (UX) studies
that were conducted with end-users and experts to evaluate
all aspects of WVH. The main contribution of this research
is the case study of evaluating a newly developed online
virtual environment, in which behavioural and subjective
feedback was collected to test the usability and the
effectiveness of the learning experience. Not all the
outcomes of the evaluation process are reported in this
paper; instead, a key outcome of each iterative cycle is
given as an example. The evaluation approach developed and
used in this research could be adopted by other researchers
to evaluate similar systems.

Introduction

Medical learning requires a multimodal approach, with the
need to offer students up-to-date evidence-based knowledge
and the explanation of processes and key procedures (Philippe
et al., 2020). Alongside scientific theory and the use of
multimedia or online resources, a core part of supporting
medical students involves practical elements, such as
placements in clinical settings. Clinical placements are
considered an essential component of the education provision
for students of medicine and other health professions. They
provide the vital and unique experience of applying textbook
knowledge to ‘real’ patients and to the demands of an
often-evolving clinical situation.


However, in clinical placements, opportunities to achieve
learning outcomes cannot be consistent across students
due to the very nature of their exposure to different patients
in different timeframes and settings. Hence, not all students
will have the chance to experience a variety of specialisms
and departments, or to see the vast number of presenting
complaints and patients (Life Sciences Hub Wales, 2022). As
a result, students often see only one point of the patient's
journey. Furthermore, the COVID-19 pandemic has brought a
more critical challenge to experiential learning, with
face-to-face interaction becoming limited (Pears et al., 2020;
Chan et al., 2021). Simulation can be used to augment clinical
placements (Schiza et al., 2020; Macnamara et al., 2021). This
learning technique provides strong engagement and helps
students develop many technical skills. It offers the chance
to practise situational awareness, make judgements, and
carry out practical procedures (e.g., fitting a catheter)
without affecting the safety of a real patient, along with the
opportunity to receive feedback and a debrief on their
performance (Chao et al., 2022).


Immersive technologies such as Virtual Reality (VR) have
received a lot of positive attention in the field of medical
education, with evidence that they create realistic and
interactive simulations that support the transmission of
knowledge, emotional engagement, role expectation, and
learning by doing (Dubovi, 2022). These technologies should
complement other forms of teaching and training rather than
substitute for them (Bruno et al., 2020).


However, creating medical scenarios, such as 3D-modelled
wards featuring virtual patients and colleagues using
computer-generated environments, can be quite expensive,
especially when multiple scenarios are needed for different
types of clinical situations. Using 360° video, which is
sometimes referred to as VR because it can be viewed in a VR
headset, gives students an omnidirectional field of view
simply by moving their head to look around, providing a
passive sense of immersion (Snelson & Hsu, 2020). The
limitation is that students cannot interact in the same way
as in computer-generated environments, where they can walk
around, interact with objects, and feel more present in the
experience (Huang et al., 2020; Witmer & Singer, 1998).


To address some of the limitations of clinical placements 
and to develop an inexpensive VR content creation platform 
that would allow clinicians to generate bespoke content, an 
innovative online virtual environment named Wales’ Virtu- 
al Hospital (WVH) was developed. The design of the virtual 
environment allows clinicians or academics to create three 
types of interactive learning experience framed around a pa- 
tient presentation in three formats: 360° still environment, 
360° video environment, and fully immersive 360° VR envi- 
ronment. The goal is to build a library of medical case studies 
from a range of specialisms and to deliver more experiential 
learning of healthcare, with opportunities for interactivity in 
the form of answering questions as the content progresses. 
Clinicians or academics can use an online toolkit named the
“Creator Mode” (CM) to create the different types of
interactive learning experiences. This can be done by
recording 360° video content, which is currently very
inexpensive, available to a consumer market, and easy to set
up and capture (Harrington et al., 2018). Students can access
the “Viewer Mode” (VM) portal to view the content and interact
with it on their mobile, computer, or VR Head Mounted Display
(HMD). As part of the WVH system, students can also answer key
questions as the scenario progresses, allowing them to make
judgement calls at different stages through a graphical
interface that appears within the 360° environment. Clinicians
or academics can view students’ engagement data (e.g., the
number of correct or wrong answers) using the “Data Mode” (DM)
toolkit.


The development of WVH was carried out by integrating an
Agile software development approach with a User-Centred
Design (UCD) approach. This resulted in more frequent
usability evaluation iterations and a systematic way to
examine and confirm end-user needs (Jurca et al., 2014).
Research shows that iterative evaluation and refinement
cycles are essential to develop an educational intervention
(Sandars & Lafferty, 2010). As the WVH system relies on
collaboration between academics, who record and upload 360°
videos and create the interactive content, and students, who
then engage with it, it is important to make the system
user-friendly, with intuitive functionality, for both types
of users (Fisher and Wright, 2010). This would encourage its
adoption into the course pedagogy and ensure learning
opportunities are effective and optimised. In this paper, we
present the comprehensive usability and UX studies that were
conducted with end-users and experts to evaluate all aspects
of the WVH system. The aim was to ensure that the system
design adheres to design principles and meets users’ needs
across the WVH system modes, and to make the delivery of the
interactive virtual learning content more streamlined and
engaging for the users.


Evaluation Method 

Figure 1 illustrates the experimental approach that was 
adopted to critically evaluate the usability of the different 
types of modes and interactive learning experiences created 
using the WVH system, and to measure the effectiveness of 
these learning experiences. Iterative development cycles,
with testing by both experts and end-users, were conducted
while working closely with the WVH development team. Expert
evaluation is typically conducted by professionals who have
a high level of expertise in a particular field or subject
matter (Ghaoui, 2005); in this project, these were
Human-Computer Interaction and Design specialists. The
experts used their knowledge and skills to assess the
usability of the system and the user learning experience.
End-user evaluation, by contrast, involves collecting
feedback from actual users of a product or service (Ghaoui,
2005). This approach allowed for a more practical and
realistic assessment of the WVH system, as it is based on the
experiences and needs of its users (i.e., students,
lecturers, and clinicians).

Figure 1. Evaluation Approach


Expert-based Evaluation: 

Three expert evaluators evaluated the system's three
modes at the different levels of interaction. This
evaluation process was iterative, starting with early
conceptual prototypes and finishing with a high-fidelity
prototype. Two expert evaluation techniques were used:

» Cognitive walkthrough (CW): This rigorous expert anal- 
ysis technique was used to check through the system 
design and logic of steps in user interaction (Lewis & 
Wharton, 1997). The focus of this technique was on 
evaluating the learnability of the system from the per- 
spective of new or infrequent users. The evaluation 
was structured around three design principles: visibil- 
ity, affordance, and feedback (Donald, 2013). During 
the evaluation process the expert evaluators went 
through the user tasks provided by the development 
team and discussed the four key questions cited by 
Wharton and his colleagues (1994). 

» Heuristic Evaluation (HE): This usability engineering 
technique allowed the expert evaluators to go through 
the system design looking for usability problems, 
guided by Jakob Nielsen’s standard usability heuristics
(Nielsen and Molich, 1990; Nielsen, 1994) and
visual-design principles (Gordon, 2020). At the end of 
the session evaluators rated the identified usability 
problems using the severity rating scales for impact 
by Nielsen (1994). The severity ratings created a pri- 
ority list for the development team to work on to im- 
prove the system. Nielsen's standard usability heuris- 
tics were chosen as they are relevant when evaluating 
the different modes of interaction (Joyce, 2021) and 
for educational systems (Mohamed & Jaafar, 2010). 
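
As an illustration of how severity ratings can yield a priority list, the following minimal Python sketch orders findings by their mean severity. The `Finding` class, the example entries, and the choice to average ratings across evaluators are assumptions made for illustration only; the paper does not state how individual ratings were combined. The labels follow Nielsen's 0 to 4 severity scale.

```python
from dataclasses import dataclass
from typing import List

# Nielsen's severity scale, from 0 (not a problem) to 4 (catastrophe).
SEVERITY_LABELS = {
    0: "not a usability problem",
    1: "cosmetic problem",
    2: "minor usability problem",
    3: "major usability problem",
    4: "usability catastrophe",
}


@dataclass
class Finding:
    """One usability problem found during heuristic evaluation (hypothetical structure)."""
    heuristic: str
    description: str
    ratings: List[int]  # one 0-4 rating per expert evaluator

    @property
    def severity(self) -> float:
        # Assumption: the ratings of all evaluators are simply averaged.
        return sum(self.ratings) / len(self.ratings)


def severity_label(finding: Finding) -> str:
    """Map the mean severity to the nearest label on the 0-4 scale."""
    return SEVERITY_LABELS[round(finding.severity)]


def prioritise(findings: List[Finding]) -> List[Finding]:
    """Return findings ordered from most to least severe."""
    return sorted(findings, key=lambda f: f.severity, reverse=True)


if __name__ == "__main__":
    # Illustrative entries only, not the ratings collected in this study.
    examples = [
        Finding("Consistency and standards", "Example problem A", [1, 1, 2]),
        Finding("User control and freedom", "Example problem B", [4, 3, 4]),
        Finding("Error prevention", "Example problem C", [2, 2, 2]),
    ]
    for f in prioritise(examples):
        print(f"{f.severity:.1f}  {severity_label(f):<26}  {f.heuristic}")
```

Sorting on the mean keeps the ordering stable when evaluators disagree slightly; a team could equally sort on the maximum rating if any single severe judgement should dominate.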


The expert-based evaluation techniques were regarded as a 
first pass of evaluation to identify as many usability problems 
as possible. This was followed by user-based evaluation to fo- 
cus the evaluation further. 

User-based Evaluation:

A total of 12 medical students evaluated the VM using the
different levels of interaction. Individual testing sessions
were conducted on the high-fidelity prototype of the system,
at ATiC’s laboratory with five students (VR mode) and
remotely with seven students (other modes). The CM and DM of
the system were evaluated by five professionals (clinicians
and academics). Research shows that 85% of usability problems
can be identified with five participants (Asarbakhsh &
Sandars, 2013).
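
The 85% figure is commonly traced to the problem-discovery model of Nielsen and Landauer (1993), in which the proportion of problems found by n participants is 1 - (1 - λ)^n. The short Python sketch below evaluates that formula. The discovery rate λ = 0.31 is the average reported by Nielsen and Landauer and is assumed here; it was not measured in this study, and the function name is illustrative.

```python
def proportion_found(n_participants: int, discovery_rate: float = 0.31) -> float:
    """Expected share of usability problems found by n participants.

    Implements 1 - (1 - L)^n, where L is the probability that a single
    participant reveals a given problem (assumed value: 0.31).
    """
    if n_participants < 0:
        raise ValueError("n_participants must be non-negative")
    if not 0.0 <= discovery_rate <= 1.0:
        raise ValueError("discovery_rate must lie between 0 and 1")
    return 1.0 - (1.0 - discovery_rate) ** n_participants


if __name__ == "__main__":
    for n in (1, 3, 5, 7, 12):
        print(f"{n:>2} participants: {proportion_found(n):.0%}")
```

With the assumed rate, five participants give roughly 84 to 85% coverage. A lower discovery rate, which is plausible for complex or unfamiliar interfaces, would require more participants to reach the same level.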


Task scenario-based sessions using the think-aloud protocol:
Participants were invited to complete a series of tasks
related to the key activities they need to perform to use the
system. Participants were provided with scenarios to give
them an explanation and context (Dumas & Redish, 1999).
As participants moved through the system to complete the
tasks, they were asked to verbalise their thoughts, feelings,
and opinions.


Behavioural observation: To avoid the observer effect
(Blalock & Blalock, 1982; Bloombaum, 1983), the user-based
evaluation sessions were video recorded using the Noldus Viso
system, and screen-capture software was used to trace and
record participants’ actions and navigation. This allowed
the researchers to analyse the participants’ interaction with
the system retrospectively. Observations were made on the key
metrics of Effectiveness (were participants able to complete
the tasks with a high degree of accuracy), Efficiency (how
fast can participants complete a task), and Errors (how many
errors do participants make and how easy is it to recover
from those errors). To allow for a more visual presentation
of the user interaction (Andrade, 2018), Tobii eye tracking
was used to capture users’ unconscious behaviour and
preferences, and to understand their decision-making.
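
The three observational metrics can be summarised from coded session recordings along the following lines. This is a minimal Python sketch under stated assumptions: the `TaskAttempt` record, its field names, and the decision to compute time only over completed attempts are illustrative choices, not the coding scheme used in this study.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import List


@dataclass
class TaskAttempt:
    """One participant's attempt at one task (hypothetical coding scheme)."""
    participant: str
    completed: bool
    seconds: float  # time spent on the task
    errors: int     # number of errors observed during the attempt


def summarise(attempts: List[TaskAttempt]) -> dict:
    """Summarise effectiveness, efficiency, and errors for a set of attempts."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    completed = [a for a in attempts if a.completed]
    times = [a.seconds for a in completed]
    return {
        # Effectiveness: share of attempts completed successfully.
        "effectiveness": len(completed) / len(attempts),
        # Efficiency: mean and SD of time on task, completed attempts only.
        "mean_time_s": mean(times) if times else None,
        "sd_time_s": stdev(times) if len(times) > 1 else None,
        # Errors: mean number of errors per attempt.
        "errors_per_attempt": mean(a.errors for a in attempts),
    }


if __name__ == "__main__":
    # Illustrative values only, not data from this study.
    demo = [
        TaskAttempt("P1", True, 60.0, 0),
        TaskAttempt("P2", True, 80.0, 1),
        TaskAttempt("P3", False, 120.0, 2),
    ]
    print(summarise(demo))
```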


Post-session interviews: Semi-structured interview sessions 
aimed to collect more detailed feedback from participants 
on the following aspects: 

» Likelihood of use: thoughts on the likelihood of them- 
selves and other students using the system. 

» Content and learning experience: the quality of the
educational content available and what could be added;
how effective and efficient this type of experience is
for learning; thoughts on the feedback they get from
interacting with the system; and finally whether the
multiple-choice question is the best way to test
students’ knowledge and learning.

» Utility: does the system offer the functions that 
end-users need. 

» Overall experience and usability: aesthetics; typogra- 
phy; learnability (ease of learning); ease of use; mem- 
orability (ease of remembering), and overall satisfac- 
tion. 


Post-session online questionnaire: The online questionnaire
consisted of three sections:

» System Usability Scale (SUS): a simple and reliable
standard 10-item questionnaire with a 5-point Likert
scale, used to collect participants’ subjective
feedback. The SUS was chosen as it is well researched
and widely used to evaluate similar systems (Brooke,
2013; Orfanou, Tselios, & Katsanos, 2015; Renaut et al.,
2006).

» Look and feel of the design: four statements with a
5-point Likert scale, which investigated participants’
thoughts on different aspects of the design.

» Satisfaction: a statement with 5-point Likert scale 
about how satisfied participants are with the overall 
experience using the system. 
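
For readers unfamiliar with SUS scoring, the following Python sketch applies the standard procedure: odd-numbered (positively worded) items contribute the response minus 1, even-numbered (negatively worded) items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The function names and the example responses are illustrative and are not taken from this study's data.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def sus_score(responses: Sequence[int]) -> float:
    """Score one participant's 10 SUS responses (each 1-5) on a 0-100 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs exactly 10 responses, each between 1 and 5")
    total = 0
    for item, response in enumerate(responses, start=1):
        if item % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


def summarise_sus(all_responses: List[Sequence[int]]) -> Tuple[float, float]:
    """Return the mean and sample SD of SUS scores across participants."""
    scores = [sus_score(r) for r in all_responses]
    sd = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), sd


if __name__ == "__main__":
    # Hypothetical responses from three participants.
    sample = [
        [4, 2, 4, 1, 5, 2, 4, 2, 4, 2],
        [5, 1, 5, 2, 4, 1, 5, 1, 4, 2],
        [3, 3, 4, 2, 4, 2, 3, 3, 4, 2],
    ]
    m, sd = summarise_sus(sample)
    print(f"SUS mean = {m:.1f}, SD = {sd:.1f}")
```

Reporting the SD alongside the mean, as done in this paper, matters with small samples because a single low individual score can shift the mean considerably.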


Implementation & Results: 

For this paper, an example of each iterative evaluation study 
will be discussed to highlight a key finding of that evaluation 
study and to illustrate how the evaluation method present- 
ed in this paper was implemented. Hence, not all the detailed 
feedback from the expert and user evaluation which has 
been shared with the development team is discussed here. 


The first round of the iterative process was carried out
with an early WVH prototype of the VM using the CW technique.
The objective of this study was to evaluate two scenarios
using the different levels of interaction. The evaluators
walked through the system, thoroughly inspecting the two
scenarios several times and completing a series of tasks. The
outcome of each task was presented in the format illustrated
in Table 1. A representative task of this study was to locate
and enter bay number 06. Once in the bay, the evaluator was
required to check some important information about the
patient (e.g., patient history, ECG; Figure 2).


Table 1. The outcome of a cognitive walkthrough

CW question: Will users try to achieve the right effect?
CW answer: Yes. Users will be able to attempt to enter the bay and find the key patient information.

CW question: Will users notice that the correct action is available?
CW answer: No. Not everyone will look around naturally. Users should be provided with information on how to navigate the 360° environment and what actions will take place all around them or what interactive elements will appear.

CW question: Will users associate the correct action with the result they’re trying to achieve?
CW answer: No. Important interface components or actionable items should not be placed too far from each other.

CW question: After the action is performed, will users see that progress is made toward the goal?
CW answer: Yes. When users click on the white dot to enter the bay, they are taken inside the bay. Every time the user clicks on an actionable item inside the bay, they are presented with the related information.


Figure 2. Screen shots of the WVH viewer mode in a Shift Scenario 


All the problems encountered in this study were categorised 
under four themes: (1) navigation; (2) interactivity; (3) feed- 
back given to users; (4) visual communication. The prob- 
lems were shared with the development team with a list of 
recommendations to improve on the design. The improved 
prototype was then tested with medical students using the 
user-based evaluation method discussed previously. 

In this study participants evaluated a scenario using an
Oculus Quest VR headset. All participants said they could see
themselves and other students using the system. Overall,
participants found the experience engaging and fun. The
immersive environment, as participants noted, gave them the
experience of ‘being there’, which supports memory and
practice-based learning. All participants commented on how
it may benefit the way they learn best: by being in the role
and in the hospital (such as on a clinical work placement).
They could see potential in how this platform could support
practice-based learning or be used as a library for
extending their knowledge. They also said it may help them
gain experience in areas where they have not had the
opportunity first-hand. Two participants thought the video
quality was not ‘crisp’ and felt that they were ‘floating’
within the VR environment. However, this did not present a
large problem for the overall UX. Unfortunately, none of the
participants were able to complete all the tasks, as the
system kept crashing before they reached the end of the
scenario. However, considering only the tasks attempted
before the system crashed, all participants completed them
with a high degree of accuracy. On average, the efficiency
measure (time spent on the tasks) was 14 minutes (SD=0.7),
and no errors were made (getting a question wrong was not
counted as an error).


In the follow-up interviews, three participants said that they 
did not realise that answers were behind them, with one par- 
ticipant wondering whether they could be brought forward, 
but then became undecided because they recognised that 
having to search the environment for answers suits the im- 
mersive format. Participants suggested that having a fixed 
number of answer options would encourage them to look 
around for them. Participants commented on the quality of 
the production of the scenarios (e.g., quality of acting skills) 
which could be improved based on the content created by 
clinicians. Generally, participants found the structure of the 
scenarios very useful to test their knowledge and learn from 
any mistakes. Participants found the multiple-choice format 
useful for both learning and testing. 


In this version of the prototype, the mean SUS score was 78
(Grade B, Good) with a standard deviation (SD) of 6.2.
Participants thought the system offered the functions that
they need, and they were happy with its look and feel.
Overall, all participants were very satisfied with the
system.


Table 2. The outcome of heuristic evaluation

Heuristic: Visibility of system status
Rating: 2
Problems encountered: It is not clear which of the text fields are optional and which are compulsory. The appropriate feedback is presented, however, the display time of the error message is too short to read.

Heuristic: Match between system and the real world
Rating: 1
Problems encountered: ‘Preview Image’ does not indicate upload image. Consider clearer labelling.

Heuristic: User control and freedom
Rating: (not shown)
Problems encountered: Unable to upload a video making it hard to complete the task of creating a marked scenario.

Heuristic: Consistency and standards
Rating: 1
Problems encountered: Inconsistency in labelling, when selecting an image from the “Images Library” the dialog box labelled ‘Select Media’ and then you are presented with ‘Select Images’.

Heuristic: Error prevention
Rating: 1
Problems encountered: You are allowed to select two images as scenario cover even though you only need one. Users should be constrained from selecting more than one image.

The iterative process of evaluating each design cycle
continued with another round of expert evaluation of the
re-designed VM and the newly developed CM and DM. For each
mode, a series of tasks was tested. For the purpose of this
paper, one representative task from the CM evaluation is
used to illustrate the process (see Table 2). The task was
to create a marked scenario, adding a stopping point at 30
seconds with one correct and two wrong answers. The
evaluator then had to move these answers and place them at a
location in the 360° environment. The CM evaluation
uncovered some violations of the design principles. As shown
in Table 2, one of the problems encountered prevented the
evaluators from completing the task. In contrast, the
insights for both the VM and DM were mostly positive, with
most of Nielsen's heuristics adhered to. After the problems
identified by the experts were fixed, another round of
user-based evaluation was conducted, in which medical
students evaluated the VM and professionals evaluated the CM
and DM.
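
To make the structure of this task concrete, the Python sketch below models a marked scenario with one stopping point and three spatially placed answers. It is purely illustrative: the paper does not describe WVH's internal data model, so every class, field, and file name here is hypothetical, as is the convention that a yaw of 0° is the viewer's initial facing direction.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Answer:
    """An answer box placed in the 360° environment (hypothetical model)."""
    text: str
    correct: bool
    yaw_deg: float = 0.0    # horizontal angle, -180 to 180; 0 = initial view
    pitch_deg: float = 0.0  # vertical angle, -90 to 90


@dataclass
class StoppingPoint:
    """A moment in the video where playback pauses and a question appears."""
    time_s: float
    question: str
    answers: List[Answer] = field(default_factory=list)

    def is_valid(self) -> bool:
        # Assumption: a marked question needs exactly one correct answer
        # and at least one wrong answer.
        n_correct = sum(a.correct for a in self.answers)
        return n_correct == 1 and len(self.answers) >= 2


@dataclass
class Scenario:
    title: str
    video_file: str
    stopping_points: List[StoppingPoint] = field(default_factory=list)


def behind_viewer(answer: Answer) -> bool:
    """True if the answer sits outside the front hemisphere of the initial view."""
    return abs(answer.yaw_deg) > 90.0


if __name__ == "__main__":
    stop = StoppingPoint(
        time_s=30.0,
        question="Example question",
        answers=[
            Answer("Option A", correct=True, yaw_deg=-20.0),
            Answer("Option B", correct=False, yaw_deg=35.0),
            Answer("Option C", correct=False, yaw_deg=150.0),
        ],
    )
    scenario = Scenario("Example scenario", "example_360.mp4", [stop])
    hidden = [a.text for a in stop.answers if behind_viewer(a)]
    print(scenario.title, stop.is_valid(), hidden)
```

A check such as `behind_viewer` could let an authoring tool warn creators when an answer is placed where students are unlikely to look, which relates to the feedback that some students did not realise answers were behind them.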


The updated version of the VM scored 84 (SD=13) (Grade
A, Excellent) on the SUS, an improvement on the previous
round of evaluation. The system, as participants noted, was
user-friendly and easy to access, with clear graphic design,
navigation, and interactivity. However, the audio quality
during part of the scenario was reported as “poor” by some
participants. Answers placed spatially around the scene were
still causing some confusion for several participants.
Participants were generally satisfied with the learning
experience.


CM and DM were evaluated by clinicians and academics. In
the CM, participants were tasked with creating new learning
content using the platform. The task involved building a
360° virtual scenario: inputting information, uploading
videos and images, and then creating questions and answers
and placing them within the 360° virtual space. None of the
users had prior experience, yet all were able to complete
the task effectively, with some commenting that they would
have no problems repeating the task now that they had
completed it once, and adding that an initial demonstration
video would have better prepared them. Two key challenges
users encountered were (1) understanding how to move (or
look) around the 360° content within the preview windows,
and (2) the choice of symbols and their placement in the
interface.


Users began by setting up their scenario content by inputting
basic information and uploading media. After typing a title
and description, many users were hesitant about two buttons,
“Show to Users” and “3D Video”, before progressing. This can
be seen in the eye tracking data in Figure 3, which shows
that they fixated on these elements. It was not clear to
users that these buttons were in fact switches, and that
clicking them would toggle different options for the
scenario on or off.

Figure 3. A heat-map of scenario information page


After uploading 360° video content, users were presented
with a window titled “Camera Centre” displaying a preview
image of the video with a symbol of a target in the centre.
Upon hovering, the cursor would change from an arrow to
a hand icon, which indicated to users they could interact
with or move this element. All users first tried to click
and drag the target symbol to move the 360° video, which had
no effect, rather than clicking the background or the image
to move the perspective. This can be seen in the eye tracking
data in Figure 4.


Figure 4. A heat-map of scenario media page 


In the final screen, users can insert questions and answers
into their 360° scenario at any moment during the playback of
the video. There were specific challenges when placing answer
boxes within the 360° environment. Each answer box had three
symbols: a target, a pencil, and a tick or cross to denote
whether it was created as a right or wrong answer. Users
assumed the pencil would allow them to ‘edit’ and move the
answer box; instead, it sent their cursor to the text entry
box on the right. When clicked, the target changed to a green
computer disk icon, which would save the location of the
answer in the 360° environment. However, it wasn’t clear that
users now had to move the background video by clicking and
dragging, and instead users tried to drag the answer box.
Once this process was finally worked out, users tried to
click the green tick (representing ‘correct answer’) instead
of the disk symbol. The eye-tracking heat-map reveals a
definite focus on the green tick symbol (Figure 5).


CM and DM scored 69 (SD=10) and 71 (SD=18) on the SUS
respectively (Grade B, Good). Overall, participants were
somewhat satisfied with both modes, with all participants
suggesting that a demo instruction video for first-time
users would have increased their satisfaction.

Figure 5. A heat-map of the final page where users can add questions and answers

At the end of each design and evaluation cycle, all project
partners and stakeholders were invited to an informal
evaluation testing session. The outcomes of these sessions
were fed into the next design cycle, together with all the
data collected via expert and user evaluation.


Discussion and Recommendations 

Conducting the evaluation in this thorough and rigorous
manner allowed for a more comprehensive and well-rounded
understanding of the designed system and led to better
decision-making and improvements to the system.

The paper offers insights on how to evaluate an interactive
educational system across different interaction levels.
Designers and project teams should take into consideration
the following:

» It is crucial to start validating design ideas at the ear- 
ly design stage and continue evaluating the system 
throughout the whole development process. 

» The findings from the expert-based and user-based
evaluation complemented each other as they provided
different perspectives. The expert-based evaluation
can be regarded as a first pass of evaluation to identify
as many usability, design, and technical problems as
possible, while the user-based evaluation highlighted
user-experience issues and areas where the system
did not meet the needs of the users.

» By combining the two approaches and collecting both
subjective and behavioural data, it is possible to
validate and confirm findings from both approaches. This
helps ensure that the findings of the evaluation process
are accurate and reliable.


The case study of evaluating the newly developed online
virtual environment contributes to research in this field
by demonstrating how this evaluation methodology, in which
behavioural and subjective feedback is collected
iteratively, can be used to test the usability and
effectiveness of similar systems. However, this research has
some limitations that need to be addressed, especially
regarding the evaluation of the effectiveness of the
learning experience. This research has only collected
subjective feedback from students. Further research is
planned to investigate the effect of VR simulation teaching
prior to in-person simulation training. The research will
involve 300 medical students using the Randomised Controlled
Trial (RCT) method. In addition, there was no comparison
between the different levels of interaction to understand
which ones students preferred or found more engaging. Lastly,
future evaluations could be further strengthened by
increasing the sample size.